AITopics

Country:

Asia > China > Hong Kong (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Indiana > Monroe County > Bloomington (0.04)
(4 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science (1.00)
Information Technology > Communications (0.67)
(2 more...)

Neural Information Processing Systems, Feb-15-2026, 21:49:15 GMT

74fa9e6bc36aa567fe7cf002b733a30d-Paper-Conference.pdf

machine learning, mechanism, natural language, (20 more...)

Country:

Europe > Finland > Uusimaa > Helsinki (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
(2 more...)

Liu, Hang, Scaglione, Anna, Peisert, Sean

Differentially Private Distribution Release of Gaussian Mixture Models via KL-Divergence Minimization

arXiv.org Artificial Intelligence, Nov-11-2025

Gaussian Mixture Models (GMMs) are widely used statistical models for representing multi-modal data distributions, with numerous applications in data mining, pattern recognition, data simulation, and machine learning. However, recent research has shown that releasing GMM parameters poses significant privacy risks, potentially exposing sensitive information about the underlying data. In this paper, we address the challenge of releasing GMM parameters while ensuring differential privacy (DP) guarantees. Specifically, we focus on the privacy protection of mixture weights, component means, and covariance matrices. We propose to use Kullback-Leibler (KL) divergence as a utility metric to assess the accuracy of the released GMM, as it captures the joint impact of noise perturbation on all the model parameters. To achieve privacy, we introduce a DP mechanism that adds carefully calibrated random perturbations to the GMM parameters. Through theoretical analysis, we quantify the effects of privacy budget allocation and perturbation statistics on the DP guarantee, and derive a tractable expression for evaluating KL divergence. We formulate and solve an optimization problem to minimize the KL divergence between the released and original models, subject to a given (ϵ, δ)-DP constraint. Extensive experiments on both synthetic and real-world datasets demonstrate that our approach achieves strong privacy guarantees while maintaining high utility.

In recent years, the remarkable success of data-driven artificial intelligence (AI) has spurred an increasing demand for the sharing and analysis of large-scale, multi-class, and high-dimensional datasets across a variety of domains, such as healthcare records, consumer transactions, and mobility traces. Organizations have recognized the potential of sharing data statistics to enhance data mining, improve public services, optimize recommendations, and facilitate data simulation [1]. However, sharing raw data or even their statistics raises significant privacy concerns, especially when sensitive attributes of individuals might be inferred.

This research was supported in part by the Director, Cybersecurity, Energy Security, and Emergency Response (CESER) office of the U.S. Department of Energy, via the Privacy-Preserving, Collective Cyberattack Defense of DERs project, under contract DE-AC02-05CH11231.
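The abstract above uses KL divergence between the original and the released GMM as its utility metric. The paper's own tractable expression and calibrated mechanism are not given in this listing, so the following is only a minimal sketch of the underlying idea, assuming one-dimensional mixtures and a plain Monte Carlo estimate of the divergence. The function names, the toy parameters, and the uncalibrated noise level are all illustrative assumptions, and the perturbation shown carries no (ϵ, δ)-DP guarantee by itself.

```python
import math
import random


def gmm_pdf(x, weights, means, stds):
    """Density of a 1-D Gaussian mixture at x."""
    return sum(
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, m, s in zip(weights, means, stds)
    )


def gmm_sample(weights, means, stds, rng):
    """Draw one sample: pick a component by weight, then sample from it."""
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(means[k], stds[k])


def kl_monte_carlo(p, q, n_samples, rng):
    """Estimate KL(p || q) as the average of log p(x) - log q(x) for x ~ p.

    p and q are (weights, means, stds) tuples.
    """
    total = 0.0
    for _ in range(n_samples):
        x = gmm_sample(*p, rng)
        total += math.log(gmm_pdf(x, *p)) - math.log(gmm_pdf(x, *q))
    return total / n_samples


rng = random.Random(0)
original = ([0.5, 0.5], [-2.0, 2.0], [1.0, 1.0])

# Illustrative perturbation of the means only. The noise level 0.3 is an
# arbitrary assumption, not a value calibrated to any privacy budget.
noisy_means = [m + rng.gauss(0.0, 0.3) for m in original[1]]
released = (original[0], noisy_means, original[2])

print("estimated KL(original || released):",
      kl_monte_carlo(original, released, 5000, rng))
```

A smaller estimate means the released model stays closer to the original one, which is the sense in which KL divergence serves as a utility measure here.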

data mining, kl divergence, machine learning, (19 more...)

2506.03467

Country: North America > United States > California (0.28)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Energy (1.00)
Government > Regional Government > North America Government > United States Government (0.74)
Government > Military > Cyberwarfare (0.54)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Neural Information Processing Systems, Oct-10-2025, 12:19:25 GMT

a68120d2eb2f53f7d9e71547591aef11-Paper-Conference.pdf

mechanism, ppr, privacy, (15 more...)

Country:

Asia > China > Hong Kong (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Indiana > Monroe County > Bloomington (0.04)
(4 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science (1.00)
Information Technology > Communications (0.67)
(2 more...)

Neural Information Processing Systems, Oct-10-2025, 06:22:24 GMT

Noise-Aware Differentially Private Regression via Meta-Learning

While Differential Privacy (DP) is the gold standard for protecting user privacy, standard DP mechanisms typically significantly impair performance.

dpconvcnp, functional mechanism, mechanism, (16 more...)

Country:

Europe > Finland > Uusimaa > Helsinki (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence, Sep-15-2025

Balancing Utility and Privacy: Dynamically Private SGD with Random Projection

Jiang, Zhanhong, Hasan, Md Zahid, Saadati, Nastaran, Balu, Aditya, Liu, Chao, Sarkar, Soumik

Stochastic optimization is a pivotal enabler in modern machine learning, producing effective models for various tasks. However, several existing works have shown that model parameters and gradient information are susceptible to privacy leakage. Although Differentially Private SGD (DPSGD) addresses privacy concerns, its static noise mechanism impacts the error bounds for model performance. Additionally, with the exponential increase in model parameters, efficient learning of these models using stochastic optimizers has become more challenging. To address these concerns, we introduce the Dynamically Differentially Private Projected SGD (D2P2-SGD) optimizer. In D2P2-SGD, we combine two important ideas: (i) dynamic differential privacy (DDP) with automatic gradient clipping and (ii) random projection with SGD, allowing dynamic adjustment of the tradeoff between utility and privacy of the model. It exhibits provably sub-linear convergence rates across different objective functions, matching the best available rate. The theoretical analysis further suggests that DDP leads to better utility at the cost of privacy, while random projection enables more efficient model learning. Extensive experiments across diverse datasets show that D2P2-SGD remarkably enhances accuracy while maintaining privacy. Our code is available here.
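The abstract above contrasts D2P2-SGD with standard DPSGD and its static noise mechanism. As a point of reference only, one standard DP-SGD update (per-example clipping, then Gaussian noise on the summed gradient) might be sketched as follows. This does not implement the dynamic noise schedule, automatic clipping, or random projection of D2P2-SGD, whose details are not given in this listing; the function names and the toy gradients are assumptions made for illustration.

```python
import math
import random


def clip_by_l2_norm(grad, clip_norm):
    """Scale grad down so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One DP-SGD update with a fixed (static) noise multiplier.

    Each per-example gradient is clipped, the clipped gradients are summed,
    Gaussian noise with standard deviation noise_multiplier * clip_norm is
    added to every coordinate of the sum, and the result is averaged.
    """
    n = len(per_example_grads)
    clipped = [clip_by_l2_norm(g, clip_norm) for g in per_example_grads]
    summed = [sum(column) for column in zip(*clipped)]
    sigma = noise_multiplier * clip_norm
    noisy_mean = [(s + rng.gauss(0.0, sigma)) / n for s in summed]
    return [p - lr * g for p, g in zip(params, noisy_mean)]


rng = random.Random(0)
params = [0.0, 0.0]
toy_grads = [[3.0, 4.0], [0.3, 0.4]]  # hypothetical per-example gradients
print(dp_sgd_step(params, toy_grads, lr=0.1, clip_norm=1.0,
                  noise_multiplier=1.1, rng=rng))
```

Clipping bounds how much any single example can move the sum, which is what lets the added noise be scaled to `clip_norm`.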

artificial intelligence, data mining, machine learning, (18 more...)

2509.09485

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Aguilera-Martínez, Francisco, Berzal, Fernando

Differential Privacy in Machine Learning: From Symbolic AI to LLMs

arXiv.org Artificial Intelligence, Jun-16-2025

Machine learning models should not reveal particular information that is not otherwise accessible. Differential privacy provides a formal framework to mitigate privacy risks by ensuring that the inclusion or exclusion of any single data point does not significantly alter the output of an algorithm, thus limiting the exposure of private information. This survey paper explores the foundational definitions of differential privacy, reviews its original formulations, and traces its evolution through key research contributions. It then provides an in-depth examination of how DP has been integrated into machine learning models, analyzing existing proposals and methods to preserve privacy when training ML models. Finally, it describes how DP-based ML techniques can be evaluated in practice. By offering a comprehensive overview of differential privacy in machine learning, this work aims to contribute to the ongoing development of secure and responsible AI systems.
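The property described above, that adding or removing one data point should not change the output much, is commonly illustrated with the Laplace mechanism on a counting query. The sketch below is a textbook example rather than anything taken from the survey itself; the function names and the toy records are assumptions. A count has sensitivity 1, so Laplace noise with scale 1/ϵ yields ϵ-DP for that single query.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Release a count under epsilon-DP using the Laplace mechanism.

    Adding or removing one record changes the true count by at most 1,
    so the sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(0)
ages = [23, 35, 41, 52, 67]  # hypothetical records
print(private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng))
```

Smaller values of `epsilon` give stronger privacy and noisier answers; the guarantee covers one release, and repeated queries consume additional privacy budget.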

artificial intelligence, machine learning, neural information processing system, (20 more...)

2506.11687

Country:

North America > United States > New York > New York County > New York City (0.15)
Europe > Austria > Vienna (0.14)
North America > United States > Rhode Island > Providence County > Providence (0.14)
(39 more...)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.92)
Research Report > New Finding (0.67)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(6 more...)

arXiv.org Artificial Intelligence, Mar-26-2025

Purifying Approximate Differential Privacy with Randomized Post-processing

Lin, Yingyu, Wang, Erchi, Ma, Yi-An, Wang, Yu-Xiang

We propose a framework to convert $(\varepsilon, \delta)$-approximate Differential Privacy (DP) mechanisms into $(\varepsilon, 0)$-pure DP mechanisms, a process we call "purification". This algorithmic technique leverages randomized post-processing with calibrated noise to eliminate the $\delta$ parameter while preserving utility. By combining the tighter utility bounds and computational efficiency of approximate DP mechanisms with the stronger guarantees of pure DP, our approach achieves the best of both worlds. We illustrate the applicability of this framework in various settings, including Differentially Private Empirical Risk Minimization (DP-ERM), data-dependent DP mechanisms such as Propose-Test-Release (PTR), and query release tasks. To the best of our knowledge, this is the first work to provide a systematic method for transforming approximate DP into pure DP while maintaining competitive accuracy and computational efficiency.

artificial intelligence, machine learning, mechanism, (15 more...)

2503.21071

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Michigan (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.67)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

arXiv.org Artificial Intelligence, Dec-26-2024

RAG with Differential Privacy

Grislain, Nicolas

Retrieval-Augmented Generation (RAG; Lewis et al. 2021) has become a popular approach to enhance the capabilities of Large Language Models (LLMs) by supplying them with up-to-date and pertinent information. This method is particularly valuable in environments where knowledge bases are large and rapidly evolving, such as news websites, social media platforms, or scientific research databases. By integrating fresh context, RAG helps mitigate the risk of "hallucinations" (instances where the model generates plausible but factually incorrect information) and significantly improves the overall quality and relevance of the responses generated by the LLM. However, incorporating external documents into the generation process introduces substantial privacy concerns. When these documents are included in the input prompt for the LLM, there is no foolproof way to ensure that the generated response will not accidentally reveal sensitive or confidential data (Qi et al. 2024). This potential for inadvertent data exposure can lead to serious breaches of privacy and presents significant ethical challenges. For instance, if an LLM is used in a healthcare setting and it accidentally includes patient information from an external document in its response, it could violate patient confidentiality and legal regulations. This paper describes a practical solution (DP-RAG) aimed at addressing these privacy concerns with Differential Privacy (DP).

large language model, machine learning, natural language, (19 more...)

2412.19291

Genre: Research Report (0.66)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

arXiv.org Artificial Intelligence, Jul-23-2024

Universally Harmonizing Differential Privacy Mechanisms for Federated Learning: Boosting Accuracy and Convergence

Feng, Shuya, Mohammady, Meisam, Hong, Hanbin, Yan, Shenao, Kundu, Ashish, Wang, Binghui, Hong, Yuan

Differentially private federated learning (DP-FL) is a promising technique for collaborative model training while ensuring provable privacy for clients. However, optimizing the tradeoff between privacy and accuracy remains a critical challenge. To the best of our knowledge, we propose the first DP-FL framework, UDP-FL, that universally harmonizes any randomization mechanism (e.g., an optimal one) with the Gaussian Moments Accountant (viz. DP-SGD) to significantly boost accuracy and convergence. Specifically, UDP-FL demonstrates enhanced model performance by mitigating the reliance on Gaussian noise. The key mediator variable in this transformation is the Rényi Differential Privacy notion, which is carefully used to harmonize privacy budgets. We also propose an innovative method to theoretically analyze the convergence for DP-FL (including our UDP-FL) based on mode connectivity analysis. Moreover, we evaluate our UDP-FL through extensive experiments benchmarked against state-of-the-art (SOTA) methods, demonstrating superior performance on both privacy guarantees and model performance. Notably, UDP-FL exhibits substantial resilience against different inference attacks, indicating a significant advance in safeguarding sensitive data in federated learning environments.
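The abstract above names Rényi Differential Privacy (RDP) as the quantity used to harmonize privacy budgets. How UDP-FL does this is not described in this listing, so the sketch below shows only the standard RDP bookkeeping for a Gaussian mechanism: the RDP cost at order α is αΔ²/(2σ²), costs add up under composition, and an (α, ρ)-RDP guarantee converts to (ρ + log(1/δ)/(α − 1), δ)-DP. The function names and the grid of orders are assumptions, and subsampling amplification is deliberately left out.

```python
import math


def gaussian_rdp(alpha, sigma, sensitivity=1.0):
    """RDP cost of order alpha for a Gaussian mechanism with noise std sigma."""
    return alpha * sensitivity ** 2 / (2.0 * sigma ** 2)


def rdp_to_dp(alpha, rdp_epsilon, delta):
    """Convert an (alpha, rdp_epsilon)-RDP guarantee to (epsilon, delta)-DP."""
    return rdp_epsilon + math.log(1.0 / delta) / (alpha - 1.0)


def composed_epsilon(sigma, steps, delta, alphas):
    """Epsilon after `steps` Gaussian releases, taking the best order in alphas.

    RDP costs add across steps; the conversion is applied once at the end.
    """
    return min(
        rdp_to_dp(a, steps * gaussian_rdp(a, sigma), delta) for a in alphas
    )


orders = [1.5, 2.0, 4.0, 8.0, 16.0, 32.0]  # illustrative grid of orders
print(composed_epsilon(sigma=2.0, steps=10, delta=1e-5, alphas=orders))
```

Because every order α yields a valid guarantee, taking the minimum over a grid is safe, and a finer grid can only tighten the reported ϵ.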

mechanism, privacy, udp-fl, (14 more...)

2407.1471

Country:

North America > United States > Connecticut > Tolland County > Storrs (0.14)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(4 more...)

Genre: Research Report > Promising Solution (0.85)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)